In information theory, the '''cross entropy''' between two probability distributions <math>p</math> and <math>q</math> over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set, if a coding scheme is used that is optimized for an "unnatural" probability distribution <math>q</math>, rather than the "true" distribution <math>p</math>.

The cross entropy for the distributions <math>p</math> and <math>q</math> over a given set is defined as follows:

:<math>H(p, q) = \operatorname{E}_p[-\log q] = H(p) + D_{\mathrm{KL}}(p \| q),</math>

where <math>H(p)</math> is the entropy of <math>p</math>, and <math>D_{\mathrm{KL}}(p \| q)</math> is the Kullback–Leibler divergence of <math>q</math> from <math>p</math> (also known as the ''relative entropy'' of ''p'' with respect to ''q''; note the reversal of emphasis).

For discrete <math>p</math> and <math>q</math> this means

:<math>H(p, q) = -\sum_x p(x)\, \log q(x).</math>

The situation for continuous distributions is analogous:

:<math>H(p, q) = -\int_X p(x)\, \log q(x)\, dx.</math>

NB: The notation <math>H(p, q)</math> is also used for a different concept, the joint entropy of <math>p</math> and <math>q</math>.

== Motivation ==

In information theory, the Kraft–McMillan theorem establishes that any directly decodable coding scheme for coding a message to identify one value <math>x_i</math> out of a set of possibilities <math>X</math> can be seen as representing an implicit probability distribution <math>q(x_i) = 2^{-\ell_i}</math> over <math>X</math>, where <math>\ell_i</math> is the length of the code for <math>x_i</math> in bits. Therefore, cross entropy can be interpreted as the expected message-length per datum when a wrong distribution <math>q</math> is assumed while the data actually follow a distribution <math>p</math>; that is why the expectation is taken over the probability distribution <math>p</math> and not <math>q</math>:

:<math>H(p, q) = \operatorname{E}_p[\ell] = \operatorname{E}_p\!\left[\log_2 \frac{1}{q(x)}\right]</math>
:<math>H(p, q) = \sum_{x} p(x)\, \log_2 \frac{1}{q(x)}</math>
:<math>H(p, q) = -\sum_{x} p(x)\, \log_2 q(x).</math>
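The identities above can be checked numerically. The following minimal Python sketch assumes two hypothetical discrete distributions <math>p</math> and <math>q</math> over the same three-element set (the particular values are illustrative, not from the article) and computes the cross entropy three ways: directly from the definition, as entropy plus Kullback–Leibler divergence, and as the expected length in bits of a code whose word lengths are <math>\ell_i = -\log_2 q(x_i)</math>.

<syntaxhighlight lang="python">
import math

# Hypothetical discrete distributions over the same three outcomes
p = [0.5, 0.25, 0.25]   # "true" distribution
q = [0.25, 0.25, 0.5]   # "unnatural" distribution assumed by the coder

# Cross entropy: H(p, q) = -sum_x p(x) log2 q(x)
cross_entropy = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

# Entropy of p: H(p) = -sum_x p(x) log2 p(x)
entropy_p = -sum(pi * math.log2(pi) for pi in p)

# Kullback-Leibler divergence: D_KL(p || q) = sum_x p(x) log2(p(x) / q(x))
kl_pq = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

# Coding interpretation: word lengths matched to q are l_i = -log2 q(x_i),
# so the expected length under p equals the cross entropy
code_lengths = [-math.log2(qi) for qi in q]
expected_length = sum(pi * li for pi, li in zip(p, code_lengths))

print(cross_entropy)        # 1.75 bits, from the definition
print(entropy_p + kl_pq)    # same value: H(p) + D_KL(p || q)
print(expected_length)      # same value: expected code length per symbol
</syntaxhighlight>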